Thai Jasmine rice or Thai Hom Mali rice is a well-known rice type that
originated in Thailand. Rice grain qualities are important in determining 
market pricing and are used in grading systems. The purpose of this research 
is to use machine learning and deep learning to improve the grading of Thai 
Hom Mali rice following standardized grading criteria. The appearance of 
grains and foreign items will determine the grade of rice. The experiment 
has two parts: grain categorization and rice grading. Multi-class support 
vector machine (SVM) and convolutional neural network (CNN) are 
proposed. There are 15 features used as input for multi-class SVM, including 
morphology and color features. With ImageNet pre-trained weights, CNN
with DenseNet201 architecture is implemented. The experiment also tested
how CNN worked with both original and preprocessed images. The
results are then compared to a neural network (NN) baseline approach. The
CNN approach, which classified each rice grain type using preprocessed
images, achieved the greatest accuracy rate of 98.25%, with an average
accuracy of 94.52% across six categories of rice grading.

1. INTRODUCTION

Rice has been a staple food of the world for thousands of years. In Asia, there are numerous rice
varieties. Jasmine rice is regarded as the finest rice in Thailand, and it is growing increasingly popular among
rice consumers all over the world. Thai jasmine rice or Thai Hom Mali rice [1], [2] is valued on the market
based on a variety of properties, including texture, shape, color, and fracture rate.
Manually evaluating or categorizing grains by human visual inspection is time-consuming, inconsistent, and
limited by the evaluator's experience. Computer-based image processing
techniques are being used to grade seeds automatically, which helps to speed up the process and improve its
accuracy. Rice quality grading is an important part of the processes used in the rice-producing
businesses to evaluate rice quality and to define rice pricing in the commercial market [3], [4].

Several studies have addressed machine learning algorithms for evaluating rice grains [5]-[10],
but none have discussed rice grading for export. For export, Thai Hom Mali rice is separated into two
categories [1]: white rice and white broken rice. White rice is divided into four grades: 100% white rice, 5%
white rice, 10% white rice, and 15% white rice. White broken rice is divided into two grades: A1 extra super
white broken rice and A1 super white broken rice. Each grade of rice has different properties depending on
its composition. White rice is graded by the presence of whole kernels, broken kernels, red kernels, yellow
kernels, chalky kernels, damaged kernels, and undeveloped and immature kernels, while white broken rice is
graded by the presence of small white broken C1 kernels. Tables 1 and 2 present the white rice standard and
the white broken rice standard, respectively. The definition of each component is given in Table 3.


Table 1. The standards of Thai Hom Mali white rice


Grain composition (%)
Grade            Whole    Broken   Red      Yellow   Chalky   Damaged   Undeveloped and
                 kernel   kernel   kernel   kernel   kernel   kernels   immature kernels
White rice 100%  >60      <0.50    <0.50    <0.20    <3.00    <0.25     <0.20
White rice 5%    >60      <0.50    <2.00    <0.50    <6.00    <0.25     <0.30
White rice 10%   >55      <0.70    <2.00    <1.00    <7.00    <0.50     <0.40
White rice 15%   >55      <2.00    <5.00    <1.00    <1.00    <1.00     <0.40


Table 2. The standards of Thai Hom Mali white broken rice


Grain composition (%)
Grade            Whole kernels   Small broken C1
A1 extra super   <15             <1
A1 super         -               <5


Table 3. Definition of kernel types


Term                               Definition
Whole kernel                       Rice kernels that have no broken portion
Broken kernel                      Kernel fragments that are less than 80% of the whole kernel area
Red kernel                         Rice kernels that are completely or partly covered in a reddish colour
Yellow kernel                      Rice kernels with a yellow colour
Chalky kernel                      Rice kernels with an opaque area greater than 50%
Damaged kernels                    Rice kernels with clearly visible damage caused by insects or other factors
Undeveloped and immature kernels   Rice kernels that have not fully developed and are flat
Small broken C1                    Small pieces of rice that pass through sieve No. 7


A number of studies have addressed rice classification using image processing [11], [12]. Many
techniques have been used, such as the k-nearest neighbors (KNN) classifier [13], [14], support vector machines (SVMs)
[15], [16], neural networks (NN) [17]-[21], and convolutional neural networks (CNN) [22]-[24]. Most of this
research is based on grain separation features such as morphology, shape, color, and texture. Koklu et al. [18] used
NN and CNN to classify four varieties of rice. Lurstwut and Pornpanomchai [19] used artificial neural networks
to build a rice seed germination evaluation system. The system used 18 features of
shape, texture, and color, and the false positive rate was 7.66%. Mavaddati [21] proposed rice grain quality
detection using principal component analysis and model learning. Jiang et al. [22] used deep learning and SVM
to recognize rice leaf diseases with an accuracy of 96.8%. Lin et al. [23] proposed a CNN-based classification of
three rice species with an accuracy rate of 95.5%. Kuo et al. [25] implemented sparse representation-based
classification to identify 30 varieties of rice grain. The images were captured by microscopy, and shape, texture, and
color properties were used, giving 89.1% accuracy. SVM [16], [26] was used to classify rice with accuracy rates of
86% and 92.22%, respectively. RiceNet, a CNN-based approach [27], was used to classify Pakistani rice types. A region
proposal-based CNN [28] was used to localize and classify rice types. In the literature, CNN has
demonstrated superior performance and benefits over other machine learning algorithms. This high accuracy comes
at the cost of substantial computing resources and a large data set.

Rice seed quality is one of the major determinants in the rice trade. Rice grain sampling is
performed to determine the rice's quality. It is a technique for obtaining a random sample of grains that is
representative of all grains in order to assess the overall quality of a pile. After that, the quality of each grain
of rice is evaluated. According to the literature review, there have been studies on determining the
types of rice, but no research has focused on rice quality grading for export. The objective
of this study is to enhance rice grain classification and grading evaluation for export from Thai Hom
Mali rice images based on multi-class SVM and CNN techniques. Multi-class SVM with the library for SVM
(libSVM) uses morphology and color features as inputs. The CNN architecture employed is DenseNet201. In
addition, we performed experiments on CNN using both preprocessed and non-preprocessed images to
compare the performance.

2. METHOD

In this paper, we propose an automatic rice grading system using machine learning and deep
learning techniques. An image of multiple sampled grains is used to determine the rice grade. First,
multi-class SVM and CNN are proposed for classifying single rice grains. Then, six rice grades are classified
from the grain sampling image. The two approaches are evaluated in terms of accuracy, time, and resource
usage. A baseline NN approach is also tested and compared. The
proposed method comprises a pre-processing step, rice segmentation using the watershed algorithm,
feature extraction, rice classification, and rice adulteration detection. The system's flowchart is shown in
Figure 1.

Figure 1. The proposed system's flowchart


2.1. Datasets 

The images of Thai Hom Mali rice variety 105 used in this paper were taken by a rice expert. A
total of 8,000 single rice grain images are used, with 1,000 of each type. There are 3,000 multiple-grain
images in total, with 500 images for each grade. The single grain image resolution is 140x140 pixels. The
images were captured using an ordinary mobile phone camera placed 10 inches above the rice
plate. The multiple-grain image resolution is 2,500x2,000 pixels. The sample grains, which range from 100 to
130 grains of rice, are scattered randomly. The rice grain images are shown in Figure 2. The
images of a single rice grain with whole kernel, broken kernel, red kernel, yellow kernel, chalky kernel,
damaged kernel, undeveloped and immature kernels, and small broken C1 are shown in Figures 2(a) to 2(h),
respectively. The image of a multiple rice grain sample is shown in Figure 2(i).


Figure 2. Rice grain (a) whole kernel, (b) broken kernel, (c) red kernel, (d) yellow kernel, (e) chalky kernel,
(f) damaged kernel, (g) undeveloped and immature kernels, (h) small broken C1, and (i) multiple rice grain
sampling


2.2. Image preprocessing

The image preprocessing step comprises image enhancement and image segmentation. To improve
image quality, a median filter with a 3x3 mask is used to remove noise, and histogram
stretching is used to enhance the contrast. The image is then binarized using the Otsu method, and Canny
edge detection is applied.
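
The enhancement and binarization steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiment: it assumes an 8-bit grayscale image stored as a list of rows, leaves border pixels unfiltered, assumes grains are brighter than the background, and omits Canny edge detection. All function names are hypothetical.

```python
from statistics import median


def median_filter_3x3(img):
    """Replace each interior pixel by the median of its 3x3 neighbourhood.

    Border pixels are copied unchanged (a simplifying assumption)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out


def histogram_stretch(img):
    """Linearly rescale grey levels so they span the full 0..255 range."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]


def otsu_threshold(img):
    """Return the grey level that maximises the between-class variance."""
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(img):
    """Median filter, stretch contrast, then threshold with Otsu's method.

    Pixels above the threshold are marked 1 (assumed to be grain)."""
    enhanced = histogram_stretch(median_filter_3x3(img))
    t = otsu_threshold(enhanced)
    return [[1 if p > t else 0 for p in row] for row in enhanced]
```

In this sketch the median filter removes isolated noise pixels before the contrast stretch, so a single outlier does not dominate the grey-level range used for stretching.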


2.3. Rice segmentation using watershed algorithm 

In the multiple rice kernel sampling image, some kernels touch each other. Adjacent kernels
must be separated so that each one can be analyzed independently. The watershed transform [29] is used to
separate two connected rice kernels. The watershed algorithm is based on mathematical morphology and
treats the image as a topographic surface in
which the grey value of each pixel in the input image gives the elevation at that position of the
surface. A darker pixel therefore represents a lower elevation on the surface. The borders of the regions are
defined by the watersheds, so two touching rice grains are separated along a watershed line. The result
of segmentation is shown in Figure 3. Figure 3(a) shows two connected rice kernels, Figure 3(b) shows the
watershed applied, and the separated kernels are shown in Figure 3(c).


Figure 3. Watershed segmentation (a) two connected rice kernels, (b) watershed, and (c) separated rice kernels


2.4. Feature extraction 

Features are extracted from the preprocessed and segmented images. Morphology and
color features are used. There are 15 features in the feature extraction step: 9 morphology features and 6
color features. Morphology features can be used to distinguish whole, broken, damaged,
and undeveloped grains, while color features can be used to distinguish red, yellow, and chalky
grains. The features used are listed in Table 4.


Table 4. Feature descriptions


Features                          Description
Morphology
1. Area                           The number of pixels in the rice grain area
2. Perimeter                      The perimeter of the rice grain
3. Roundness                      The circularity of the rice grain, as shown in (1)
4. Eccentricity                   The distance between the ellipse's foci divided by the length of the main axis
5. Major axis length              The rice grain length
6. Minor axis length              The rice grain width
7. Aspect ratio                   The ratio of the major axis to the minor axis, as shown in (2)
8. Equivalent diameter            The diameter of a sphere with properties equivalent to the object, as shown in (3)
9. Convex area                    The area of the convex hull of the kernel
Color
1. Average of red color           The percentage of red color
2. Average of green color         The percentage of green color
3. Average of blue color          The percentage of blue color
4. Average of hue color           The percentage of hue color
5. Average of saturation color    The percentage of saturation color
6. Average of intensity color     The percentage of intensity color
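\[ \text{Roundness} = \frac{\text{perimeter}^2}{4 \times \pi \times \text{area}} \tag{1} \]

\[ \text{Aspect ratio} = \frac{\text{length of major axis}}{\text{length of minor axis}} \tag{2} \]

\[ \text{Equivalent diameter} = \sqrt{\frac{4 \times \text{area}}{\pi}} \tag{3} \]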
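
As a worked illustration of the derived morphology features, the short sketch below computes roundness, aspect ratio, and equivalent diameter from basic measurements. It is a minimal sketch with hypothetical function names. The form used for equation (3), the diameter of the circle with the same area as the object, is the common definition and is assumed here rather than confirmed by the source.

```python
import math


def roundness(perimeter, area):
    """Equation (1): perimeter^2 / (4 * pi * area); equals 1.0 for a perfect circle."""
    return perimeter ** 2 / (4 * math.pi * area)


def aspect_ratio(major_axis, minor_axis):
    """Equation (2): major axis length divided by minor axis length."""
    return major_axis / minor_axis


def equivalent_diameter(area):
    """Diameter of the circle with the same area (assumed form of equation (3))."""
    return math.sqrt(4 * area / math.pi)


def mean_rgb(pixels):
    """Average red, green, and blue values over the (r, g, b) pixels of one grain."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))
```

For a circle of radius 10 pixels (perimeter 2π·10, area π·100), roundness evaluates to 1 and the equivalent diameter to 20. An elongated whole kernel has a roundness above 1 and an aspect ratio well above 1, which is what lets these features separate whole kernels from short fragments.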
2.5. Rice classification

To classify single rice grains, a total of 8,000 images are used in the experiment, including 1,000
images of each type of grain. The images are split into 80% for training and 20% for testing.
Both the CNN and the multi-class SVM are employed.

An SVM [30] classifies data by projecting input vectors into a higher-dimensional space and
constructing a hyperplane that separates the data in that space. libSVM [31] is used. The SVM takes all
of the features as input. The radial basis function is the kernel function of the SVM employed in this work.
The accuracy is examined by varying the penalty value C and the kernel function parameter γ. The optimal
SVM parameters are determined using grid search. As the SVM is a binary classifier, multi-class SVM
classifiers are created by combining several single SVM classifiers to categorize all rice types.
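
The pieces described above can be sketched in plain Python: the radial basis function kernel, a majority vote over pairwise binary classifiers, and a grid search over C and γ. This is a minimal sketch, not the libSVM implementation. The one-against-one voting scheme is an assumption (the text does not state which combination strategy was used), and `binary_predict` and `score` are hypothetical callables standing in for trained binary SVMs and a validation accuracy function.

```python
import math
from collections import Counter
from itertools import combinations, product


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def one_vs_one_predict(x, classes, binary_predict):
    """Combine pairwise binary decisions by majority vote.

    binary_predict(x, a, b) is a hypothetical callable that returns
    either a or b, standing in for one trained binary SVM."""
    votes = Counter(binary_predict(x, a, b) for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]


def grid_search(c_values, gamma_values, score):
    """Return the (C, gamma) pair with the highest score.

    score(C, gamma) is a hypothetical callable, e.g. cross-validated accuracy."""
    return max(product(c_values, gamma_values), key=lambda cg: score(*cg))


# Toy usage: 1-D "length" prototypes stand in for trained binary SVMs
# (hypothetical values, for illustration only).
prototypes = {"whole": 6.0, "broken": 3.0, "chalky": 5.0}


def nearest_prototype(x, a, b):
    return a if abs(x - prototypes[a]) <= abs(x - prototypes[b]) else b


predicted = one_vs_one_predict(5.8, list(prototypes), nearest_prototype)
```

With eight kernel types, a one-against-one scheme trains 8 × 7 / 2 = 28 binary classifiers, and each grain is assigned the class that wins the most pairwise votes.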

CNN [13], [31]-[34] is a deep learning approach that performs feature extraction, pattern
identification, and classification using multiple layers of non-linear information processing. A multi-layer
perceptron (MLP) is used to classify the image. The DenseNet201 architecture with ImageNet pre-trained weights
is used for the CNN. During training, 10-fold cross validation was applied. A CNN is composed of an input layer,
convolutional layers, pooling layers, a flatten layer, and fully connected layers. The features
are learned automatically by the CNN from the input layer. The convolutional layer slides a kernel
across the input image to generate a feature map. The pooling layer reduces
the size of the output from the preceding layer while preserving as much information as feasible. To
prepare data for entry to a fully connected layer, the flatten layer transforms multidimensional output data to
one dimension. In a fully connected layer, every input is connected to every output node, each connection is
multiplied by a different weight, and every output node can apply an appropriate activation. The CNN
architecture is shown in Figure 4.
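
To make the role of each layer type concrete, the following plain-Python sketch passes a tiny image through one convolution, one pooling step, a flatten step, and one fully connected layer. It is an illustration of the layer operations only, not DenseNet201 and not the trained model. The kernel, weights, and biases are hypothetical values chosen by hand, and no activation function is applied.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) to build a feature map.

    As in most CNN libraries, this is cross-correlation (the kernel is not flipped)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def max_pool_2x2(fmap):
    """Keep the largest value in each non-overlapping 2x2 block."""
    return [
        [max(fmap[y][x], fmap[y][x + 1], fmap[y + 1][x], fmap[y + 1][x + 1])
         for x in range(0, len(fmap[0]) - 1, 2)]
        for y in range(0, len(fmap) - 1, 2)
    ]


def flatten(fmap):
    """Turn a 2-D feature map into a 1-D list."""
    return [v for row in fmap for v in row]


def dense(inputs, weights, biases):
    """Fully connected layer: each output is a weighted sum of all inputs plus a bias."""
    return [sum(w * v for w, v in zip(ws, inputs)) + b for ws, b in zip(weights, biases)]


# Toy forward pass on a 5x5 image containing a vertical edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]           # responds to dark-to-bright transitions
feature_map = conv2d_valid(image, edge_kernel)   # 4x4
pooled = max_pool_2x2(feature_map)               # 2x2
vector = flatten(pooled)                         # length 4
outputs = dense(vector, [[1, 0, 1, 0], [0, 1, 0, 1]], [0.5, -0.5])
```

In a pre-trained network the kernels and weights are learned from data rather than chosen by hand, and many such layers are stacked.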
Figure 4. The proposed CNN architecture


2.6. Rice grading

To determine the rice's quality, a multi-grain image is employed. Each grain is separated into a
single-grain image using watershed segmentation. Each grain is then passed to the classification model to
determine which class it belongs to. The proportion of detected grains in each class is then computed and
compared with the composition standards in Tables 1 and 2 to determine the grade of rice.
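
The grading step can be sketched as a lookup against the white rice standard. This is a minimal sketch under several assumptions: percentages are computed by grain count, grades are tested from strictest to most lenient, and the thresholds follow Table 1, where two entries that are illegible in the source are read as ">55" and "<2.00". The function and variable names are hypothetical, and the white broken rice standard in Table 2 could be handled in the same way.

```python
from collections import Counter

# (grade, minimum whole-kernel %, upper limits in % for the other kernel types).
# Values follow Table 1; the ">55" for White rice 10% and the "<2.00" broken-kernel
# limit for White rice 15% are assumed readings of illegible entries.
WHITE_RICE_STANDARDS = [
    ("White rice 100%", 60, {"broken": 0.50, "red": 0.50, "yellow": 0.20,
                             "chalky": 3.00, "damaged": 0.25, "undeveloped": 0.20}),
    ("White rice 5%", 60, {"broken": 0.50, "red": 2.00, "yellow": 0.50,
                           "chalky": 6.00, "damaged": 0.25, "undeveloped": 0.30}),
    ("White rice 10%", 55, {"broken": 0.70, "red": 2.00, "yellow": 1.00,
                            "chalky": 7.00, "damaged": 0.50, "undeveloped": 0.40}),
    ("White rice 15%", 55, {"broken": 2.00, "red": 5.00, "yellow": 1.00,
                            "chalky": 1.00, "damaged": 1.00, "undeveloped": 0.40}),
]


def composition(labels):
    """Percentage of each predicted kernel class among the segmented grains."""
    counts = Counter(labels)
    total = len(labels)
    return {kind: 100 * n / total for kind, n in counts.items()}


def grade_white_rice(comp):
    """Return the first (strictest) grade whose limits the sample satisfies, else None."""
    for grade, min_whole, limits in WHITE_RICE_STANDARDS:
        if comp.get("whole", 0.0) > min_whole and all(
            comp.get(kind, 0.0) < limit for kind, limit in limits.items()
        ):
            return grade
    return None


# Example: 120 classified grains from one sample image (hypothetical labels).
sample = ["whole"] * 112 + ["chalky"] * 6 + ["red"] * 2
grade = grade_white_rice(composition(sample))
```

Here the sample contains 5.0% chalky and about 1.67% red kernels, so it fails the 100% grade on both limits and is assigned the 5% grade.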


3. RESULTS AND DISCUSSION 

The experiment is separated into two parts: single grain classification and rice grading. The
accuracy is calculated as in (4), by dividing the number of correct classifications by the total
amount of data. True positives (TP) and true negatives (TN) are positive and negative
data that have been correctly classified. False positives (FP) and false negatives (FN) are data that have
been incorrectly classified as positive and negative, respectively.


\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4} \]

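
As a short worked example of (4), the sketch below derives TP, TN, FP, and FN for one class from a confusion matrix and computes the accuracy. The matrix values are hypothetical counts chosen for illustration, and the function names are not from the original work. Rows are assumed to hold the true class and columns the predicted class.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (4): correct classifications divided by all classifications."""
    return (tp + tn) / (tp + tn + fp + fn)


def one_vs_rest_counts(confusion, k):
    """TP, TN, FP, FN for class k from a square confusion matrix.

    Rows are true classes and columns are predicted classes (an assumption)."""
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp
    fp = sum(row[k] for row in confusion) - tp
    tn = total - tp - fn - fp
    return tp, tn, fp, fn


# Hypothetical counts for a two-class test set of 100 grains.
confusion = [[45, 5],
             [10, 40]]
tp, tn, fp, fn = one_vs_rest_counts(confusion, 0)
acc = accuracy(tp, tn, fp, fn)
```

For class 0 this gives TP = 45, TN = 40, FP = 10, and FN = 5, so the accuracy is 85/100 = 0.85.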
Three classification approaches are compared: multi-class SVM, NN, and CNN. CNN's
performance with both original and preprocessed images was also investigated in the experiment.
The CNN technique using preprocessed images has the greatest accuracy rate of 98.25%. The result from
CNN is then used in the next grading experiment. The result image of rice grading is shown in Figure 5.
The multiple grains image is shown in Figure 5(a). The detected grains and classified grains are shown in
Figure 5(b) and Figure 5(c), respectively. Tables 5 and 6 show the accuracy rates of single grain
categorization and rice grading, respectively. The average accuracy of rice grading is 94.52%.


Figure 5. Rice grading result (a) multiple grains, (b) detected grains, and (c) classified grains


Table 5. The result of single grain categorization (accuracy, %)


Method                     Whole    Broken   Red      Yellow   Chalky   Damaged   Undeveloped and    Small       Average
                           kernel   kernel   kernel   kernel   kernel   kernels   immature kernels   broken C1
CNN + preprocessed image   95.12    96.68    96.87    95.23    96.67    96.58     95.55              97.31       98.25
CNN + original image       94.25    92.77    91.24    90.25    91.55    92.87     93.54              95.27       92.72
Multi-class SVM            91.52    90.12    90.11    90.50    90.11    91.54     90.25              91.27       90.68
NN                         82.11    81.24    83.24    85.21    84.36    81.58     89.21              80.24       83.40


Table 6. The result of rice grading (%)


Type             Rice 100%   Rice 5%   Rice 10%   Rice 15%   A1 extra super   A1 super
Rice 100%        92.50       2.50      3.00       2.00       0                0
Rice 5%          2.00        93.25     2.12       2.38       0                0
Rice 10%         1.75        3.25      93.75      1.25       0                0
Rice 15%         1.21        2.31      1.91       94.57      0                0
A1 extra super   0           0         0          0          97.50            2.50
A1 super         0           0         0          0          4.43             95.57
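
The reported average grading accuracy can be reproduced from the diagonal of Table 6, as the short check below shows. The variable names are illustrative, and the per-grade accuracies are copied from the table.

```python
# Diagonal entries of Table 6: percentage of samples of each grade
# that were assigned to the correct grade.
per_grade_accuracy = {
    "Rice 100%": 92.50,
    "Rice 5%": 93.25,
    "Rice 10%": 93.75,
    "Rice 15%": 94.57,
    "A1 extra super": 97.50,
    "A1 super": 95.57,
}

# Unweighted mean over the six grades (each grade has the same number of images).
average_accuracy = sum(per_grade_accuracy.values()) / len(per_grade_accuracy)
print(f"{average_accuracy:.2f}")  # 94.52
```

The unweighted mean over the six grades is 567.14 / 6, which rounds to the 94.52% stated in the text.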


As can be observed from the results, there are detection mistakes. The major issue arises from
the similar sizes of the seeds. The apparent size of several grains is distorted by light and shadows. For
example, damaged kernels and undeveloped kernels look nearly identical, resulting in inaccurate
detection. These issues can be addressed by increasing the amount of training data captured under varying
photographic conditions. The results show that the deep learning technique with CNN outperforms machine
learning with multi-class SVM, but it is important to note that the increased accuracy comes at the cost of
more computing power. Another consideration is that whereas multi-class SVM requires a large number of
computed features, CNN can work with the image itself. When comparing time complexity, NN is the least
complex with O(1), multi-class SVM is next with a complexity that grows polynomially in n, and CNN is the
most complex. The trade-off between accuracy and computing power depends on the performance of the
available devices.


4. CONCLUSION 

Single rice grain classification is an important part of rice grading for determining milling quality
and rice export. Multi-class SVM and CNN are used in this experiment. A total of 15 features are provided
as input for multi-class SVM, while CNN extracts features automatically. According to the results, using
CNN with preprocessed images gave better results than using the originals. Rice grading will be improved
by increasing the number of grain categories. Although CNN is more accurate than multi-class SVM and NN,
it takes longer to process and uses a significant amount of system resources. Machine learning
works well with small to medium datasets, whereas deep learning requires a large dataset. In image
recognition, deep learning is particularly effective for automatic feature extraction across several layers of the
network. The proposed approach might be applied on smartphones, allowing local producers to verify rice
quality and then improve the quality of their own output to increase sales prices and improve quality of life.
In future work, large-scale datasets are still required for an efficient deep learning system. The
combination of deep learning and feature extraction should be investigated. Other deep learning architectures
such as 3CNNs, MobileNet, or InceptionResNetV2 should be explored further.